STA 364 Time Series Applications

Author
Affiliation

Ch8. Exponential smoothing

?var:college.name
?var:course.number ?var:course.semester

Exponential smoothing

Historical perspective

  • Developed in the 1950s and 1960s as methods (algorithms) to produce point forecasts.
  • Combine a “level”, “trend” (slope) and “seasonal” component to describe a time series.
  • The rate of change of the components are controlled by “smoothing parameters”: \(\alpha\), \(\beta\) and \(\gamma\) respectively.
  • Need to choose best values for the smoothing parameters (and initial states).
  • Equivalent ETS state space models developed in the 1990s and 2000s.

Big idea: control the rate of change

\(\alpha\) controls the flexibility of the level (Level - the average value for a specific time period)

  • If \(\alpha = 0\), the level never updates (mean)
  • If \(\alpha = 1\), the level updates completely (naive)

\(\beta\) controls the flexibility of the trend

  • If \(\beta = 0\), the trend is linear
  • If \(\beta = 1\), the trend changes suddenly every observation

\(\gamma\) controls the flexibility of the seasonality

  • If \(\gamma = 0\), the seasonality is fixed (seasonal means)
  • If \(\gamma = 1\), the seasonality updates completely (seasonal naive)

ETS models

Additive ("A") or multiplicative ("M")

None ("N"), additive ("A"), multiplicative ("M"), or damped ("Ad" or "Md").

None ("N"), additive ("A") or multiplicative ("M")

Simple exponential smoothing

Simple methods

Time series \(y_1,y_2,\dots,y_T\).

  • Want something in between these methods.
  • Most recent data should have more weight.

Simple Exponential Smoothing

Simple Exponential Smoothing

Simple Exponential Smoothing

  • \(\ell_t\) is the level (or the smoothed value) of the series at time t.
  • \(\pred{y}{t+1}{t} = \alpha y_t + (1-\alpha) \pred{y}{t}{t-1}\)

Iterate to get exponentially weighted moving average form.

Optimising smoothing parameters

  • Need to choose best values for \(\alpha\) and \(\ell_0\).
  • Similarly to regression, choose optimal parameters by minimising SSE: \[ \text{SSE}=\sum_{t=1}^T(y_t - \pred{y}{t}{t-1})^2. \]
  • Unlike regression there is no closed form solution — use numerical optimization.
  • For Algerian Exports example:
    • \(\hat\alpha = 0.8400\)
    • \(\hat\ell_0 = 39.54\)

Simple Exponential Smoothing

Models and methods

Methods

  • Algorithms that return point forecasts.

Models

  • Generate same point forecasts but can also generate forecast distributions.
  • A stochastic (or random) data generating process that can generate an entire forecast distribution.
  • Allow for “proper” model selection.

ETS(A,N,N): SES with additive errors

Forecast error: \(e_t = y_t - \pred{y}{t}{t-1} = y_t - \ell_{t-1}\).

Specify probability distribution for \(e_t\), we assume \(e_t = \varepsilon_t\sim\text{NID}(0,\sigma^2)\).

ETS(A,N,N): SES with additive errors

where \(\varepsilon_t\sim\text{NID}(0,\sigma^2)\).

  • “innovations” or “single source of error” because equations have the same error process, \(\varepsilon_t\).
  • Measurement equation: relationship between observations and states.
  • State equation(s): evolution of the state(s) through time.

ETS(M,N,N): SES with multiplicative errors.

  • Specify relative errors \(\varepsilon_t=\frac{y_t-\pred{y}{t}{t-1}}{\pred{y}{t}{t-1}}\sim \text{NID}(0,\sigma^2)\)
  • Substituting \(\pred{y}{t}{t-1}=\ell_{t-1}\) gives:
    • \(y_t = \ell_{t-1}+\ell_{t-1}\varepsilon_t\)
    • \(e_t = y_t - \pred{y}{t}{t-1} = \ell_{t-1}\varepsilon_t\)
  • Models with additive and multiplicative errors with the same parameters generate the same point forecasts but different prediction intervals.

ETS(A,N,N): Specifying the model

ETS(y ~ error("A") + trend("N") + season("N"))

By default, an optimal value for \(\alpha\) and \(\ell_0\) is used.

\(\alpha\) can be chosen manually in trend().

trend("N", alpha = 0.5)
trend("N", alpha_range = c(0.2, 0.8))

Example: Algerian Exports

algeria_economy <- global_economy %>%
  filter(Country == "Algeria")
fit <- algeria_economy %>%
  model(ANN = ETS(Exports ~ error("A") + trend("N") + season("N")))
report(fit)
Series: Exports 
Model: ETS(A,N,N) 
  Smoothing parameters:
    alpha = 0.8399875 

  Initial states:
   l[0]
 39.539

  sigma^2:  35.6301

     AIC     AICc      BIC 
446.7154 447.1599 452.8968 

Example: Algerian Exports

components(fit) %>% autoplot()
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

Example: Algerian Exports

components(fit) %>%
  left_join(fitted(fit), by = c("Country", ".model", "Year"))
# A dable: 59 x 7 [1Y]
# Key:     Country, .model [1]
# :        Exports = lag(level, 1) + remainder
   Country .model  Year Exports level remainder .fitted
   <fct>   <chr>  <dbl>   <dbl> <dbl>     <dbl>   <dbl>
 1 Algeria ANN     1959    NA    39.5    NA        NA  
 2 Algeria ANN     1960    39.0  39.1    -0.496    39.5
 3 Algeria ANN     1961    46.2  45.1     7.12     39.1
 4 Algeria ANN     1962    19.8  23.8   -25.3      45.1
 5 Algeria ANN     1963    24.7  24.6     0.841    23.8
 6 Algeria ANN     1964    25.1  25.0     0.534    24.6
 7 Algeria ANN     1965    22.6  23.0    -2.39     25.0
 8 Algeria ANN     1966    26.0  25.5     3.00     23.0
 9 Algeria ANN     1967    23.4  23.8    -2.07     25.5
10 Algeria ANN     1968    23.1  23.2    -0.630    23.8
# ℹ 49 more rows

Example: Algerian Exports

fit %>%
  forecast(h = 5) %>%
  autoplot(algeria_economy) +
  labs(y = "% of GDP", title = "Exports: Algeria")

Models with trend

Holt’s linear trend

  • Two smoothing parameters \(\alpha\) and \(\beta^*\) (\(0\le\alpha,\beta^*\le1\)).
  • \(\ell_t\) level: weighted average between \(y_t\) and one-step ahead forecast for time \(t\), \((\ell_{t-1} + b_{t-1}=\pred{y}{t}{t-1})\)
  • \(b_t\) slope: weighted average of \((\ell_{t} - \ell_{t-1})\) and \(b_{t-1}\), current and previous estimate of slope.
  • Choose \(\alpha, \beta^*, \ell_0, b_0\) to minimise SSE.

ETS(A,A,N)

Holt’s linear method with additive errors.

  • Assume \(\varepsilon_t=y_t-\ell_{t-1}-b_{t-1} \sim \text{NID}(0,\sigma^2)\).
  • Substituting into the error correction equations for Holt’s linear method \[\begin{align*} y_t&=\ell_{t-1}+b_{t-1}+\varepsilon_t\\ \ell_t&=\ell_{t-1}+b_{t-1}+\alpha \varepsilon_t\\ b_t&=b_{t-1}+\alpha\beta^* \varepsilon_t \end{align*}\]
  • For simplicity, set \(\beta=\alpha \beta^*\).

Exponential smoothing: trend/slope

ETS(M,A,N)

Holt’s linear method with multiplicative errors.

  • Assume \(\varepsilon_t=\frac{y_t-(\ell_{t-1}+b_{t-1})}{(\ell_{t-1}+b_{t-1})}\)
  • Following a similar approach as above, the innovations state space model underlying Holt’s linear method with multiplicative errors is specified as \[\begin{align*} y_t&=(\ell_{t-1}+b_{t-1})(1+\varepsilon_t)\\ \ell_t&=(\ell_{t-1}+b_{t-1})(1+\alpha \varepsilon_t)\\ b_t&=b_{t-1}+\beta(\ell_{t-1}+b_{t-1}) \varepsilon_t \end{align*}\] where again \(\beta=\alpha \beta^*\) and \(\varepsilon_t \sim \text{NID}(0,\sigma^2)\).

ETS(A,A,N): Specifying the model

ETS(y ~ error("A") + trend("A") + season("N"))

By default, optimal values for \(\beta\) and \(b_0\) are used.

\(\beta\) can be chosen manually in trend().

trend("A", beta = 0.004)
trend("A", beta_range = c(0, 0.1))

Example: Australian population

aus_economy <- global_economy %>% filter(Code == "AUS") %>%
  mutate(Pop = Population / 1e6)
fit <- aus_economy %>%
  model(AAN = ETS(Pop ~ error("A") + trend("A") + season("N")))
report(fit)
Series: Pop 
Model: ETS(A,A,N) 
  Smoothing parameters:
    alpha = 0.9999 
    beta  = 0.3266366 

  Initial states:
     l[0]      b[0]
 10.05414 0.2224818

  sigma^2:  0.0041

      AIC      AICc       BIC 
-76.98569 -75.83184 -66.68347 

Example: Australian population

components(fit) %>% autoplot()
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

Example: Australian population

components(fit) %>%
  left_join(fitted(fit), by = c("Country", ".model", "Year"))
# A dable: 59 x 8 [1Y]
# Key:     Country, .model [1]
# :        Pop = lag(level, 1) + lag(slope, 1) + remainder
   Country   .model  Year   Pop level slope remainder .fitted
   <fct>     <chr>  <dbl> <dbl> <dbl> <dbl>     <dbl>   <dbl>
 1 Australia AAN     1959  NA    10.1 0.222 NA           NA  
 2 Australia AAN     1960  10.3  10.3 0.222 -0.000145    10.3
 3 Australia AAN     1961  10.5  10.5 0.217 -0.0159      10.5
 4 Australia AAN     1962  10.7  10.7 0.231  0.0418      10.7
 5 Australia AAN     1963  11.0  11.0 0.223 -0.0229      11.0
 6 Australia AAN     1964  11.2  11.2 0.221 -0.00641     11.2
 7 Australia AAN     1965  11.4  11.4 0.221 -0.000314    11.4
 8 Australia AAN     1966  11.7  11.7 0.235  0.0418      11.6
 9 Australia AAN     1967  11.8  11.8 0.206 -0.0869      11.9
10 Australia AAN     1968  12.0  12.0 0.208  0.00350     12.0
# ℹ 49 more rows

Example: Australian population

fit %>%
  forecast(h = 10) %>%
  autoplot(aus_economy) +
  labs(y = "Millions", title = "Population: Australia")

Damped trend method

  • Damping parameter \(0<\phi<1\).
  • If \(\phi=1\), identical to Holt’s linear trend.
  • As \(h\rightarrow\infty\), \(\pred{y}{T+h}{T}\rightarrow \ell_T+\phi b_T/(1-\phi)\).
  • Short-run forecasts trended, long-run forecasts constant.

Example: Australian population

aus_economy %>%
  model(holt = ETS(Pop ~ error("A") + trend("Ad") + season("N"))) %>%
  forecast(h = 20) %>%
  autoplot(aus_economy)

Example: Australian population

fit <- aus_economy %>%
  filter(Year <= 2010) %>%
  model(
    ses = ETS(Pop ~ error("A") + trend("N") + season("N")),
    holt = ETS(Pop ~ error("A") + trend("A") + season("N")),
    damped = ETS(Pop ~ error("A") + trend("Ad") + season("N"))
  )
tidy(fit)
accuracy(fit)

Example: Australian population

Warning: The future dataset is incomplete, incomplete out-of-sample data will be treated as missing. 
2 observations are missing between 2018 and 2019
term SES Linear trend Damped trend
\(\alpha\) 1.00 1.00 1.00
\(\beta^*\) 0.30 0.40
\(\phi\) 0.98
NA 0.22 0.25
NA 10.28 10.05 10.04
Training RMSE 0.24 0.06 0.07
Test RMSE 1.63 0.15 0.21
Test MASE 6.18 0.55 0.75
Test MAPE 6.09 0.55 0.74
Test MAE 1.45 0.13 0.18

Models with seasonality

Holt-Winters additive method

Holt and Winters extended Holt’s method to capture seasonality.
  • \(k=\) integer part of \((h-1)/m\). Ensures estimates from the final year are used for forecasting.
  • Parameters:  \(0\le \alpha\le 1\)\(0\le \beta^*\le 1\)\(0\le \gamma\le 1-\alpha\)  and \(m=\) period of seasonality (e.g. \(m=4\) for quarterly data).

Holt-Winters additive method

  • Seasonal component is usually expressed as \(s_{t} = \gamma^* (y_{t}-\ell_{t})+ (1-\gamma^*)s_{t-m}.\)
  • Substitute in for \(\ell_t\): \(s_{t} = \gamma^*(1-\alpha) (y_{t}-\ell_{t-1}-b_{t-1})+ [1-\gamma^*(1-\alpha)]s_{t-m}\)
  • We set \(\gamma=\gamma^*(1-\alpha)\).
  • The usual parameter restriction is \(0\le\gamma^*\le1\), which translates to \(0\le\gamma\le(1-\alpha)\).

Exponential smoothing: seasonality

ETS(A,A,A)

Holt-Winters additive method with additive errors.

  • Forecast errors: \(\varepsilon_{t} = y_t - \hat{y}_{t|t-1}\)
  • \(k\) is integer part of \((h-1)/m\).

Holt-Winters multiplicative method

Seasonal variations change in proportion to the level of the series.

  • \(k\) is integer part of \((h-1)/m\).
  • Additive method: \(s_t\) in absolute terms — within each year \(\sum_i s_i \approx 0\).
  • Multiplicative method: \(s_t\) in relative terms — within each year \(\sum_i s_i \approx m\).

ETS(M,A,M)

Holt-Winters multiplicative method with multiplicative errors.

  • Forecast errors: \(\varepsilon_{t} = (y_t - \hat{y}_{t|t-1})/\hat{y}_{t|t-1}\)
  • \(k\) is integer part of \((h-1)/m\).

Example: Australian holiday tourism

aus_holidays <- tourism %>%
  filter(Purpose == "Holiday") %>%
  summarise(Trips = sum(Trips))
fit <- aus_holidays %>%
  model(
    additive = ETS(Trips ~ error("A") + trend("A") + season("A")),
    multiplicative = ETS(Trips ~ error("M") + trend("A") + season("M"))
  )
fc <- fit %>% forecast()

Example: Australian holiday tourism

fc %>%
  autoplot(aus_holidays, level = NULL) +
  labs(y = "Thousands", title = "Overnight trips")

Estimated components

components(fit)
# A dable: 168 x 7 [1Q]
# Key:     .model [2]
# :        Trips = lag(level, 1) + lag(slope, 1) + lag(season, 4) + remainder
   .model   Quarter  Trips level slope season remainder
   <chr>      <qtr>  <dbl> <dbl> <dbl>  <dbl>     <dbl>
 1 additive 1997 Q1    NA    NA   NA    1512.      NA  
 2 additive 1997 Q2    NA    NA   NA    -290.      NA  
 3 additive 1997 Q3    NA    NA   NA    -684.      NA  
 4 additive 1997 Q4    NA  9899. -37.4  -538.      NA  
 5 additive 1998 Q1 11806. 9964. -24.5  1512.     433. 
 6 additive 1998 Q2  9276. 9851. -35.6  -290.    -374. 
 7 additive 1998 Q3  8642. 9700. -50.2  -684.    -489. 
 8 additive 1998 Q4  9300. 9694. -44.6  -538.     188. 
 9 additive 1999 Q1 11172. 9652. -44.3  1512.      10.7
10 additive 1999 Q2  9608. 9676. -35.6  -290.     290. 
# ℹ 158 more rows

Estimated components

Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).
Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).
Figure 1

Holt-Winters damped method

Often the single most accurate forecasting method for seasonal data:

Holt-Winters with daily data

sth_cross_ped <- pedestrian %>%
  filter(
    Date >= "2016-07-01",
    Sensor == "Southern Cross Station"
  ) %>%
  index_by(Date) %>%
  summarise(Count = sum(Count) / 1000)
sth_cross_ped %>%
  filter(Date <= "2016-07-31") %>%
  model(
    hw = ETS(Count ~ error("M") + trend("Ad") + season("M"))
  ) %>%
  forecast(h = "2 weeks") %>%
  autoplot(sth_cross_ped %>% filter(Date <= "2016-08-14")) +
  labs(
    title = "Daily traffic: Southern Cross",
    y = "Pedestrians ('000)"
  )

Holt-Winters with daily data

Innovations state space models

Exponential smoothing methods

ETS models

Additive error models

Multiplicative error models

Estimating ETS models

  • Smoothing parameters \(\alpha\), \(\beta\), \(\gamma\) and \(\phi\), and the initial states \(\ell_0\), \(b_0\), \(s_0,s_{-1},\dots,s_{-m+1}\) are estimated by maximising the “likelihood” = the probability of the data arising from the specified model.
  • For models with additive errors equivalent to minimising SSE.
  • For models with multiplicative errors, equivalent to minimising SSE.

Innovations state space models

  • Estimate parameters \(\bm\theta = (\alpha,\beta,\gamma,\phi)\) and initial states \(\bm{x}_0 = (\ell_0,b_0,s_0,s_{-1},\dots,s_{-m+1})\) by minimizing \(L^*\).

Parameter restrictions

Usual region

  • Traditional restrictions in the methods \(0< \alpha,\beta^*,\gamma^*,\phi<1\)(equations interpreted as weighted averages).
  • In models we set \(\beta=\alpha\beta^*\) and \(\gamma=(1-\alpha)\gamma^*\).
  • Therefore \(0< \alpha <1\),    \(0 < \beta < \alpha\)    and \(0< \gamma < 1-\alpha\).
  • \(0.8<\phi<0.98\) — to prevent numerical difficulties.

Admissible region

  • To prevent observations in the distant past having a continuing effect on current forecasts.
  • Usually (but not always) less restrictive than region.
  • For example for ETS(A,N,N): \(0< \alpha <1\) while \(0< \alpha <2\).

Model selection

where \(L\) is the likelihood and \(k\) is the number of parameters initial states estimated in the model.

which is the AIC corrected (for small sample bias).

AIC and cross-validation

Automatic forecasting

From Hyndman et al. (IJF, 2002):

  • Apply each model that is appropriate to the data. Optimize parameters and initial values using MLE (or some other criterion).
  • Select best method using AICc:
  • Produce forecasts using best method.
  • Obtain forecast intervals using underlying state space model.

Method performed very well in M3 competition.

Example: National populations

fit <- global_economy %>%
  mutate(Pop = Population / 1e6) %>%
  model(ets = ETS(Pop))
Warning: 1 error encountered for ets
[1] ETS does not support missing values.
fit
# A mable: 263 x 2
# Key:     Country [263]
   Country                      ets
   <fct>                    <model>
 1 Afghanistan         <ETS(A,A,N)>
 2 Albania             <ETS(M,A,N)>
 3 Algeria             <ETS(M,A,N)>
 4 American Samoa      <ETS(M,A,N)>
 5 Andorra             <ETS(M,A,N)>
 6 Angola              <ETS(M,A,N)>
 7 Antigua and Barbuda <ETS(M,A,N)>
 8 Arab World          <ETS(M,A,N)>
 9 Argentina           <ETS(A,A,N)>
10 Armenia             <ETS(M,A,N)>
# ℹ 253 more rows

Example: National populations

fit %>%
  forecast(h = 5)
# A fable: 1,315 x 5 [1Y]
# Key:     Country, .model [263]
   Country  .model  Year             Pop
   <fct>    <chr>  <dbl>          <dist>
 1 Afghani… ets     2018    N(36, 0.012)
 2 Afghani… ets     2019    N(37, 0.059)
 3 Afghani… ets     2020     N(38, 0.16)
 4 Afghani… ets     2021     N(39, 0.35)
 5 Afghani… ets     2022     N(40, 0.64)
 6 Albania  ets     2018 N(2.9, 0.00012)
 7 Albania  ets     2019   N(2.9, 6e-04)
 8 Albania  ets     2020  N(2.9, 0.0017)
 9 Albania  ets     2021  N(2.9, 0.0036)
10 Albania  ets     2022  N(2.9, 0.0066)
# ℹ 1,305 more rows
# ℹ 1 more variable: .mean <dbl>

Example: Australian holiday tourism

holidays <- tourism %>%
  filter(Purpose == "Holiday")
fit <- holidays %>% model(ets = ETS(Trips))
fit
# A mable: 76 x 4
# Key:     Region, State, Purpose [76]
   Region                     State              Purpose          ets
   <chr>                      <chr>              <chr>        <model>
 1 Adelaide                   South Australia    Holiday <ETS(A,N,A)>
 2 Adelaide Hills             South Australia    Holiday <ETS(A,A,N)>
 3 Alice Springs              Northern Territory Holiday <ETS(M,N,A)>
 4 Australia's Coral Coast    Western Australia  Holiday <ETS(M,N,A)>
 5 Australia's Golden Outback Western Australia  Holiday <ETS(M,N,M)>
 6 Australia's North West     Western Australia  Holiday <ETS(A,N,A)>
 7 Australia's South West     Western Australia  Holiday <ETS(M,N,M)>
 8 Ballarat                   Victoria           Holiday <ETS(M,N,A)>
 9 Barkly                     Northern Territory Holiday <ETS(A,N,A)>
10 Barossa                    South Australia    Holiday <ETS(A,N,N)>
# ℹ 66 more rows

Example: Australian holiday tourism

fit %>%
  filter(Region == "Snowy Mountains") %>%
  report()
Series: Trips 
Model: ETS(M,N,A) 
  Smoothing parameters:
    alpha = 0.1571013 
    gamma = 0.0001000991 

  Initial states:
     l[0]      s[0]    s[-1]     s[-2]     s[-3]
 141.6782 -60.95904 130.8567 -42.23776 -27.65986

  sigma^2:  0.0388

     AIC     AICc      BIC 
852.0452 853.6008 868.7194 

Example: Australian holiday tourism

fit %>%
  filter(Region == "Snowy Mountains") %>%
  components(fit)
# A dable: 84 x 9 [1Q]
# Key:     Region, State, Purpose, .model [1]
# :        Trips = (lag(level, 1) + lag(season, 4)) * (1 + remainder)
   Region          State           Purpose .model Quarter Trips level season remainder
   <chr>           <chr>           <chr>   <chr>    <qtr> <dbl> <dbl>  <dbl>     <dbl>
 1 Snowy Mountains New South Wales Holiday ets    1997 Q1  NA     NA   -27.7   NA     
 2 Snowy Mountains New South Wales Holiday ets    1997 Q2  NA     NA   -42.2   NA     
 3 Snowy Mountains New South Wales Holiday ets    1997 Q3  NA     NA   131.    NA     
 4 Snowy Mountains New South Wales Holiday ets    1997 Q4  NA    142.  -61.0   NA     
 5 Snowy Mountains New South Wales Holiday ets    1998 Q1 101.   140.  -27.7   -0.113 
 6 Snowy Mountains New South Wales Holiday ets    1998 Q2 112.   142.  -42.2    0.154 
 7 Snowy Mountains New South Wales Holiday ets    1998 Q3 310.   148.  131.     0.137 
 8 Snowy Mountains New South Wales Holiday ets    1998 Q4  89.8  148.  -61.0    0.0335
 9 Snowy Mountains New South Wales Holiday ets    1999 Q1 112.   147.  -27.7   -0.0687
10 Snowy Mountains New South Wales Holiday ets    1999 Q2 103.   147.  -42.2   -0.0199
# ℹ 74 more rows

Example: Australian holiday tourism

fit %>%
  filter(Region == "Snowy Mountains") %>%
  components(fit) %>%
  autoplot()
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).

Example: Australian holiday tourism

fit %>% forecast()
# A fable: 608 x 7 [1Q]
# Key:     Region, State, Purpose, .model [76]
   Region         State           Purpose .model Quarter
   <chr>          <chr>           <chr>   <chr>    <qtr>
 1 Adelaide       South Australia Holiday ets    2018 Q1
 2 Adelaide       South Australia Holiday ets    2018 Q2
 3 Adelaide       South Australia Holiday ets    2018 Q3
 4 Adelaide       South Australia Holiday ets    2018 Q4
 5 Adelaide       South Australia Holiday ets    2019 Q1
 6 Adelaide       South Australia Holiday ets    2019 Q2
 7 Adelaide       South Australia Holiday ets    2019 Q3
 8 Adelaide       South Australia Holiday ets    2019 Q4
 9 Adelaide Hills South Australia Holiday ets    2018 Q1
10 Adelaide Hills South Australia Holiday ets    2018 Q2
# ℹ 598 more rows
# ℹ 2 more variables: Trips <dist>, .mean <dbl>

Example: Australian holiday tourism

fit %>% forecast() %>%
  filter(Region == "Snowy Mountains") %>%
  autoplot(holidays) +
  labs(y = "Thousands", title = "Overnight trips")

Residuals

Response residuals

\[\hat{e}_t = y_t - \hat{y}_{t|t-1}\]

Innovation residuals

Additive error model: \[\hat\varepsilon_t = y_t - \hat{y}_{t|t-1}\]

Multiplicative error model: \[\hat\varepsilon_t = \frac{y_t - \hat{y}_{t|t-1}}{\hat{y}_{t|t-1}}\]

Example: Australian holiday tourism

aus_holidays <- tourism %>%
  filter(Purpose == "Holiday") %>%
  summarise(Trips = sum(Trips))
fit <- aus_holidays %>%
  model(ets = ETS(Trips)) %>%
  report()
Series: Trips 
Model: ETS(M,N,M) 
  Smoothing parameters:
    alpha = 0.3578226 
    gamma = 0.0009685565 

  Initial states:
     l[0]      s[0]     s[-1]    s[-2]    s[-3]
 9666.501 0.9430367 0.9268433 0.968352 1.161768

  sigma^2:  0.0022

     AIC     AICc      BIC 
1331.372 1332.928 1348.046 

Example: Australian holiday tourism

residuals(fit)
residuals(fit, type = "response")

Example: Australian holiday tourism

fit %>%
  augment()
# A tsibble: 80 x 6 [1Q]
# Key:       .model [1]
   .model Quarter  Trips .fitted .resid   .innov
   <chr>    <qtr>  <dbl>   <dbl>  <dbl>    <dbl>
 1 ets    1998 Q1 11806.  11230.  576.   0.0513 
 2 ets    1998 Q2  9276.   9532. -257.  -0.0269 
 3 ets    1998 Q3  8642.   9036. -393.  -0.0435 
 4 ets    1998 Q4  9300.   9050.  249.   0.0275 
 5 ets    1999 Q1 11172.  11260.  -88.0 -0.00781
 6 ets    1999 Q2  9608.   9358.  249.   0.0266 
 7 ets    1999 Q3  8914.   9042. -129.  -0.0142 
 8 ets    1999 Q4  9026.   9154. -129.  -0.0140 
 9 ets    2000 Q1 11071.  11221. -150.  -0.0134 
10 ets    2000 Q2  9196.   9308. -111.  -0.0120 
# ℹ 70 more rows

Some unstable models

  • Some of the combinations of (Error, Trend, Seasonal) can lead to numerical difficulties; see equations with division by a state.
  • These are: ETS(A,N,M), ETS(A,A,M), ETS(A,A,M).
  • Models with multiplicative errors are useful for strictly positive data, but are not numerically stable with data containing zeros or negative values. In that case only the six fully additive models will be applied.

Exponential smoothing models

Forecasting with exponential smoothing

Forecasting with ETS models

iterate the equations for \(t=T+1,T+2,\dots,T+h\) and set all \(\varepsilon_t=0\) for \(t>T\).

  • Not the same as \(\text{E}(y_{t+h} | \bm{x}_t)\) unless seasonality is additive.
  • fable uses \(\text{E}(y_{t+h} | \bm{x}_t)\).
  • Point forecasts for ETS(A,*,*) are identical to ETS(M,*,*) if the parameters are the same.

Forecasting with ETS models

can only be generated using the models.

  • The prediction intervals will differ between models with additive and multiplicative errors.
  • Exact formulae for some models.
  • More general to simulate future sample paths, conditional on the last estimate of the states, and to obtain prediction intervals from the percentiles of these simulated future paths.

Prediction intervals

Example: Corticosteroid drug sales

h02 <- PBS %>%
  filter(ATC2 == "H02") %>%
  summarise(Cost = sum(Cost))
h02 %>% autoplot(Cost)

Example: Corticosteroid drug sales

h02 %>%
  model(ETS(Cost)) %>%
  report()
Series: Cost 
Model: ETS(M,Ad,M) 
  Smoothing parameters:
    alpha = 0.3071016 
    beta  = 0.0001006793 
    gamma = 0.0001007181 
    phi   = 0.977528 

  Initial states:
     l[0]    b[0]      s[0]     s[-1]     s[-2]     s[-3]     s[-4]    s[-5]    s[-6]
 417268.7 8205.82 0.8716807 0.8259747 0.7562808 0.7733338 0.6872373 1.283821 1.324616
    s[-7]    s[-8]    s[-9]   s[-10]    s[-11]
 1.180067 1.163601 1.104801 1.047963 0.9806235

  sigma^2:  0.0046

     AIC     AICc      BIC 
5515.212 5518.909 5574.938 

Example: Corticosteroid drug sales

h02 %>%
  model(ETS(Cost ~ error("A") + trend("A") + season("A"))) %>%
  report()
Series: Cost 
Model: ETS(A,A,A) 
  Smoothing parameters:
    alpha = 0.1702163 
    beta  = 0.006310854 
    gamma = 0.4545987 

  Initial states:
     l[0]     b[0]      s[0]     s[-1]     s[-2]     s[-3]     s[-4]    s[-5]    s[-6]
 409705.9 9097.111 -99075.37 -136602.3 -191496.1 -174530.8 -241436.7 210643.8 244644.2
    s[-7]    s[-8]    s[-9]  s[-10]    s[-11]
 145368.2 130569.6 84457.69 39131.7 -11673.71

  sigma^2:  3498869384

     AIC     AICc      BIC 
5585.278 5588.568 5641.686 

Example: Corticosteroid drug sales

h02 %>%
  model(ETS(Cost)) %>%
  forecast() %>%
  autoplot(h02)

Example: Corticosteroid drug sales

h02 %>%
  model(
    auto = ETS(Cost),
    AAA = ETS(Cost ~ error("A") + trend("A") + season("A"))
  ) %>%
  accuracy()
Model MAE RMSE MAPE MASE RMSSE
auto 38649.04 51102.24 4.988983 0.6375806 0.6891173
AAA 43378.40 56784.23 6.047574 0.7155993 0.7657394